Log Likelihood, Part 1


To compute the log likelihood, we need the ratios of conditional probabilities and a score built from them that lets us decide whether a tweet is positive or negative. The higher a word's ratio, the more positive the word is.

To do inference, you can compute the following:

$$\frac{P(pos)}{P(neg)} \prod_{i=1}^{m} \frac{P(w_i \mid pos)}{P(w_i \mid neg)} > 1$$
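This product of ratios can be sketched directly. The probability tables below are made-up values for illustration, not learned from data, and equal priors are assumed:

```python
# Hypothetical conditional probabilities (assumed values for illustration).
p_w_pos = {"happy": 0.3, "sad": 0.05, "great": 0.2}
p_w_neg = {"happy": 0.05, "sad": 0.3, "great": 0.05}
p_pos, p_neg = 0.5, 0.5  # equal priors assumed

def ratio_score(tweet):
    """Prior ratio times the product of per-word probability ratios."""
    score = p_pos / p_neg
    for w in tweet:
        if w in p_w_pos:  # skip words outside the vocabulary
            score *= p_w_pos[w] / p_w_neg[w]
    return score

ratio_score(["happy", "great"])  # > 1, so classify as positive
ratio_score(["sad"])             # < 1, so classify as negative
```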

As $m$ gets larger, the product of many small probabilities can cause numerical underflow, so we introduce the $\log$, which gives you the following equation:

$$\log \left(\frac{P(pos)}{P(neg)} \prod_{i=1}^{m} \frac{P(w_i \mid pos)}{P(w_i \mid neg)}\right) = \log \frac{P(pos)}{P(neg)} + \sum_{i=1}^{m} \log \frac{P(w_i \mid pos)}{P(w_i \mid neg)}$$
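The underflow problem and the log-space fix can be seen numerically. Here each word is assumed to contribute a ratio of 0.5 (a made-up value) over a 2000-word document; the raw product collapses to zero while the equivalent sum of logs stays finite:

```python
import math

# 2000 per-word ratios, each 0.5 (hypothetical values for illustration).
ratios = [0.5] * 2000

product = 1.0
for r in ratios:
    product *= r  # 0.5**2000 is far below the smallest float: underflows to 0.0

# Same quantity in log space: a finite, usable score.
log_score = sum(math.log(r) for r in ratios)
```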

The first component is called the log prior and the second component is the log likelihood. We further introduce $\lambda$, the log ratio for a single word, as follows:

$$\lambda(w) = \log \frac{P(w \mid pos)}{P(w \mid neg)}$$

Having the $\lambda$ dictionary helps a lot at inference time: $\lambda$ is computed once per word in the vocabulary, and scoring a tweet reduces to adding the log prior to the sum of the $\lambda$ values of its words.
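A minimal sketch of this, again with hypothetical probability tables and equal priors; `lam` is the precomputed $\lambda$ dictionary, and unseen words contribute 0:

```python
import math

# Hypothetical conditional probabilities (assumed values for illustration).
p_w_pos = {"happy": 0.3, "sad": 0.05, "great": 0.2}
p_w_neg = {"happy": 0.05, "sad": 0.3, "great": 0.05}

# Precompute lambda once per vocabulary word.
lam = {w: math.log(p_w_pos[w] / p_w_neg[w]) for w in p_w_pos}

log_prior = math.log(0.5 / 0.5)  # equal priors assumed, so this is 0

def predict(tweet):
    """Positive iff log prior + sum of lambdas of the tweet's words > 0."""
    return log_prior + sum(lam.get(w, 0.0) for w in tweet) > 0

predict(["happy", "great"])  # True  (both lambdas are positive)
predict(["sad"])             # False (lambda of "sad" is negative)
```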
